An enhanced classification method comprising a genetic algorithm, rough set theory and a modified PBMF-index function

Author

  • Kuang Yu Huang
Abstract

This study proposes a method, designated as the GRP-index method, for the classification of continuous value datasets in which the instances do not provide any class information and may be imprecise and uncertain. The proposed method discretizes the values of the individual attributes within the dataset and achieves both the optimal number of clusters and the optimal classification accuracy. The proposed method consists of a genetic algorithm (GA) and an FRP-index method. In the FRP-index method, the conditional and decision attribute values of the instances in the dataset are fuzzified and discretized using the Fuzzy C-means (FCM) method in accordance with the cluster vectors given by the GA specifying the number of clusters per attribute. Rough set (RS) theory is then applied to determine the lower and upper approximate sets associated with each cluster of the decision attribute. The accuracy of approximation of each cluster of the decision attribute is then computed as the cardinality ratio of the lower approximate sets to the upper approximate sets. Finally, the centroids of the lower approximate sets associated with each cluster of the decision attribute are determined by computing the mean conditional and decision attribute values of all the instances within the corresponding sets. The cluster centroids and accuracy of approximation are then processed by a modified form of the PBMF-index function, designated as the RP-index function, in order to determine the optimality of the discretization/classification results. In the event that the termination criteria are not satisfied, the GA modifies the initial population of cluster vectors and the FCM, RS and RP-index function procedures are repeated. The entire process is repeated iteratively until the termination criteria are satisfied. The maximum value of the RP cluster validity index is then identified, and the corresponding cluster vector is taken as the optimal classification result. 
The validity of the proposed approach is confirmed by cross validation, and by comparing the classification results obtained for a typical stock market dataset with those obtained by non-supervised and pseudosupervised classification methods. The results show that the proposed GRP-index method not only has a better discretization performance than the considered methods, but also achieves a better accuracy of approximation, and therefore provides a more reliable basis for the extraction of decision-making rules. © 2011 Elsevier B.V. All rights reserved.
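
The rough-set step described in the abstract (lower and upper approximations of each decision cluster, and the accuracy of approximation as their cardinality ratio) can be illustrated with a minimal Python sketch. It assumes the attributes have already been discretized into crisp cluster labels, for instance by taking the highest-membership FCM cluster per attribute; the function name `approximation_accuracy` and the toy data are hypothetical and are not taken from the paper, and the fuzzified variant the paper uses may differ in detail.

```python
from collections import defaultdict


def approximation_accuracy(condition_labels, decision_labels):
    """Return {decision_cluster: (lower_set, upper_set, accuracy)}.

    condition_labels: one tuple of discretized conditional attribute
        labels per instance (assumed to be precomputed, e.g. by FCM).
    decision_labels: one discretized decision label per instance.
    """
    # Instances with identical conditional labels are indiscernible,
    # so they fall into the same equivalence class.
    classes = defaultdict(set)
    for idx, cond in enumerate(condition_labels):
        classes[tuple(cond)].add(idx)

    result = {}
    for cluster in sorted(set(decision_labels)):
        target = {i for i, d in enumerate(decision_labels) if d == cluster}
        lower, upper = set(), set()
        for members in classes.values():
            if members <= target:   # class lies entirely inside the cluster
                lower |= members
            if members & target:    # class overlaps the cluster
                upper |= members
        # upper is never empty here because target is non-empty
        result[cluster] = (lower, upper, len(lower) / len(upper))
    return result


# Toy data (hypothetical): six instances, two conditional attributes.
conditions = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 0)]
decisions = [0, 0, 0, 1, 0, 1]
for cluster, (low, up, acc) in approximation_accuracy(conditions, decisions).items():
    print(cluster, sorted(low), sorted(up), round(acc, 3))
```

In this toy case instances 3 and 4 share the same conditional labels but carry different decision labels, so they appear in the upper approximation of both decision clusters and in the lower approximation of neither, which lowers the accuracy of approximation for both.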

Similar resources

Evaluation of Rough Set Theory for Decision Making of rehabilitation Method for Concrete Pavement

In recent years, a great number of advanced theoretical-empirical methods have been developed for the design and modeling of concrete pavement distress. However, there is no reliable theoretical method to be used in evaluating concrete pavement distresses and deciding how to repair them. Only empirical methods are used for this purpose. One of the most usual methods in evaluating concrete paveme...

A Hybrid Approach to Continuous Valued Datasets Classifying based on Particle Swarm Optimization, Variable Precision Rough Set Theory and Modified Huang-index Function

This paper proposed a new hybrid method, designated as PSOVPRS-index method, for partitioning and classifying continuous valued datasets based on particle swarm optimization (PSO) algorithm, Variable Precision Rough Set (VPRS) theory and a modified form of the Huang-index function. In contrast to the Huang-based index method which simply assigns a constant number of clusters to each attribute a...

Topological structure on generalized approximation space related to n-arry relation

The classical structure of rough set theory was first formulated by Z. Pawlak in [6]. The foundation of its object classification is an equivalence binary relation and equivalence classes. The upper and lower approximation operations are two core notions in rough set theory. They can also be seen as a closure operator and an interior operator of the topology induced by an equivalence relation on a u...

Combination of Feature Selection and Learning Methods for IoT Data Fusion

In this paper, we propose five data fusion schemes for the Internet of Things (IoT) scenario, which are Relief and Perceptron (Re-P), Relief and Genetic Algorithm Particle Swarm Optimization (Re-GAPSO), Genetic Algorithm and Artificial Neural Network (GA-ANN), Rough and Perceptron (Ro-P) and Rough and GAPSO (Ro-GAPSO). All the schemes consist of four stages, including preprocessing the data set ba...

A hybrid filter-based feature selection method via hesitant fuzzy and rough sets concepts

High-dimensional microarray datasets are difficult to classify since they have many features with a small number of instances and an imbalanced distribution of classes. This paper proposes a filter-based feature selection method to improve the classification performance of microarray datasets by selecting the significant features. Combining the concepts of rough sets, weighted rough set, fuzzy rough se...

Journal title:
  • Appl. Soft Comput.

Volume 12  Issue 

Pages  -

Publication date 2012